Electricity Bills Analysis - ENCEVI 2018

##          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 516468 27.6    1160092   62         NA   631060 33.8
## Vcells 963671  7.4    8388608   64      16384  1761712 13.5
<style type="text/css">
 img {   background-color:transparent;   border:0; }  table.st-table > thead > tr {    background-color: #eeeeee; }  table.st-table th {   text-align: center; }  table.st-table td span {   display: block; }  .st-container {   width: 100%;   padding-right: 15px;   padding-left: 15px;   margin-right: auto;   margin-left: auto;   margin-top: 15px; }  .st-multiline {   white-space: pre; }  .st-table {     width: auto;     table-layout: auto;     margin-top: 20px;     margin-bottom: 20px;     max-width: 100%;     background-color: transparent;     border-collapse: collapse; }  .st-table > thead > tr > th, .st-table > tbody > tr > th, .st-table > tfoot > tr > th, .st-table > thead > tr > td, .st-table > tbody > tr > td, .st-table > tfoot > tr > td {   vertical-align: middle; }  .st-table-bordered {   border: 1px solid #bbbbbb; }  .st-table-bordered > thead > tr > th, .st-table-bordered > tbody > tr > th, .st-table-bordered > tfoot > tr > th, .st-table-bordered > thead > tr > td, .st-table-bordered > tbody > tr > td, .st-table-bordered > tfoot > tr > td {   border: 1px solid #cccccc; }  .st-table-bordered > thead > tr > th, .st-table-bordered > thead > tr > td, .st-table thead > tr > th {   border-bottom: none; }  .st-table td, .st-table th {   padding: 8px; }  .st-table > thead > tr {    background-color: #eeeeee; }  .st-table-bordered > thead > tr > th, .st-table-bordered > tbody > tr > th, .st-table-bordered > thead > tr > td, .st-table-bordered > tbody > tr > td {   border: 1px solid #cccccc; }   .st-table-striped > tbody > tr:nth-of-type(odd) {   background-color: #ffffff; }  .st-table-striped > tbody > tr:nth-of-type(even) {   background-color: #f8f8f8; }  .st-table-striped > tbody > tr:nth-of-type(odd) {   background-color: #ffffff; }  .st-table-striped > tbody > tr:nth-of-type(even) {   background-color: #f9f9f9; }   .st-descr-table > thead > tr > th, .st-descr-table > tbody > tr > th, .st-descr-table > tfoot > tr > th, .st-descr-table > thead > tr > td, .st-descr-table > tbody > tr > td, .st-descr-table > tfoot > tr > td {   padding-left:  24px;   padding-right: 24px;   word-wrap: break-word; }  .st-descr-table td {   text-align: right; }  .st-freq-table, .st-freq-table-nomiss, .st-cross-table {   border: medium none; }  .st-freq-table  > thead > tr:nth-child(1) > th:nth-child(1), .st-cross-table > thead > tr:nth-child(1) > th:nth-child(1), .st-cross-table > thead > tr:nth-child(1) > th:nth-child(3) {   border:none;   background-color: #ffffff;   text-align: center; }  .st-freq-table > thead > tr > th, .st-freq-table > tbody > tr > th, .st-freq-table > tfoot > tr > th, .st-freq-table > thead > tr > td, .st-freq-table > tbody > tr > td, .st-freq-table > tfoot > tr > td, .st-freq-table-nomiss > thead > tr > th, .st-freq-table-nomiss > tbody > tr > th, .st-freq-table-nomiss > tfoot > tr > th, .st-freq-table-nomiss > thead > tr > td, .st-freq-table-nomiss > tbody > tr > td, .st-freq-table-nomiss > tfoot > tr > td, .st-cross-table > thead > tr > th, .st-cross-table > tbody > tr > th, .st-cross-table > tfoot > tr > th, .st-cross-table > thead > tr > td, .st-cross-table > tbody > tr > td, .st-cross-table > tfoot > tr > td {   padding-left:  20px;   padding-right: 20px; }   .st-protect-top-border {   border-top:1px solid #cccccc !important; }  .st-cross-table td {   text-align: center; }  .st-small {   font-size: 13px; }  .st-small td, .st-small th {   padding: 8px; }  .st-small > thead > tr > th, .st-small > tbody > tr > th, .st-small > tfoot > tr > th, .st-small > thead > tr > td, .st-small > tbody > tr > td, .st-small > tfoot > tr > td {   padding-left:  12px;   padding-right: 12px; } </style>

Loading datasets

Cleaning and merging dataframes

Descriptive Analysis

Dwellings surveyed per region

- Number of dwellings surveyed: 28,953
- Number of dwellings represented by the survey: 33,162,148, standard error: 151,533.1

Dwellings connected to the grid

  • Number of dwellings connected to the grid: 32,808,508*
  Freq % % Cum.
0 353640.00 1.07 1.07
1 32808508.00 98.93 100.00
Total 33162148.00 100.00 100.00

Dwellings connected to the grid by region
Number of dwellings connected to the grid by region

Electricity Bill Dates

df.enc.dwell$bill.ini.date1 <- NA
df.enc.dwell$bill.end.date1 <- NA

# The data was obtained from January 2018 to June 2018.
# So the bills could be from 2017 or 2018. As the survey does not report 
# the year of the electricity bill, it is calculated here
df.enc.dwell$year_ini1 <- "2018"
df.enc.dwell$year_end1 <- "2018"

# Case 1. If the initial period of the bill is from june to december, 
# it is assumed that the year of the initial period is 2017
df.enc.dwell$year_ini1[df.enc.dwell$mes_inic1 >= 6] <- 2017

# Case 2. If the final period of the bill is from july to december, 
# it is assumed that the year of the final period is 2017
df.enc.dwell$year_end1[df.enc.dwell$mes_final1 >= 7] <- 2017

# Case 3. If the initia month period of the bill is higher than the final month, 
# it is assumed that the year of the initial period is 2017
df.enc.dwell$year_ini1[df.enc.dwell$mes_inic1 > df.enc.dwell$mes_final1] <- 2017

df.enc.dwell$bill.ini.date1 <-  str_replace_all(paste(df.enc.dwell$mes_inic1, "-", df.enc.dwell$inicia1, "-", 
                                          df.enc.dwell$year_ini1), pattern=" ", repl="")

df.enc.dwell$bill.end.date1 <-  str_replace_all(paste(df.enc.dwell$mes_final1, "-", df.enc.dwell$final1, "-", 
                                          df.enc.dwell$year_end1), pattern=" ", repl="")

df.enc.dwell$bill.ini.date1 <- as.Date(as.character(df.enc.dwell$bill.ini.date1), 
                           format="%m-%d-%Y")

df.enc.dwell$bill.end.date1 <- as.Date(as.character(df.enc.dwell$bill.end.date1), 
                           format="%m-%d-%Y")

df.enc.dwell$bill.days1 <- as.integer((df.enc.dwell$bill.end.date1 - df.enc.dwell$bill.ini.date1))

# After obtaining the number of days of the period in the electricity bill. 
# There are still some special cases that are corrected here.
# Case 4. If the period in the electricity bill is longer than 1 year, the 
# initial year is assumed to be 2018
df.enc.dwell$year_ini1[df.enc.dwell$bill.days1 >= 365 & 
                          (df.enc.dwell$mes_inic1 <= df.enc.dwell$mes_final1)] <- 2018

# Case 5. If the period in the electricity bill is negative, the initial 
# year is assumed to be 2017
df.enc.dwell$year_ini1[df.enc.dwell$bill.days1 < 0] <- 2017

df.enc.dwell$bill.end.date1 <-  str_replace_all(paste(df.enc.dwell$mes_final1, "-", df.enc.dwell$final1, "-", 
                                          df.enc.dwell$year_end1), pattern=" ", repl="")
df.enc.dwell$bill.ini.date1 <-  str_replace_all(paste(df.enc.dwell$mes_inic1, "-" , df.enc.dwell$inicia1, "-", 
                                          df.enc.dwell$year_ini1), pattern=" ", repl="")

df.enc.dwell$bill.end.date1 <-  str_replace_all(paste(df.enc.dwell$mes_final1, "-" , df.enc.dwell$final1, "-", 
                                          df.enc.dwell$year_end1), pattern=" ", repl="")

df.enc.dwell$bill.ini.date1 <- as.Date(as.character(df.enc.dwell$bill.ini.date1), format="%m-%d-%Y")

df.enc.dwell$bill.end.date1 <- as.Date(as.character(df.enc.dwell$bill.end.date1), format="%m-%d-%Y")

df.enc.dwell$bill.days1 <- as.integer((df.enc.dwell$bill.end.date1 - df.enc.dwell$bill.ini.date1))

pl.hist.bill.days1 <- plot_ly(x = df.enc.dwell$bill.days1,
             type = "histogram",
             histnorm = "probability")%>%
  layout(title = 'Distribution of Days Billed by CFE',
         xaxis = list(title = "days", range = c(0, 350)), 
         yaxis = list(title = "probability"))

pl.hist.bill.days1
### Descriptive Statistics  
#### df.enc.dwell$bill.days1  
**N:** 28953  

|         &nbsp; |  Mean | Median | Std.Dev. |    Min |    Max |  N.Valid | % Valid |
|---------------:|------:|-------:|---------:|-------:|-------:|---------:|--------:|
| **bill.days1** | 56.85 |  61.00 |    16.64 | -61.00 | 364.00 | 11231.00 |   38.79 |

Table: Table continues below

 

|         &nbsp; | percent.valid |
|---------------:|--------------:|
| **bill.days1** |         38.79 |
  • Only : 11,231 households surveyed (38.7904535%) provided information about their bills.

The Federal Electricity Commission (CFE) bills their residential clients in a monthly or bi-monthly fashion.
The period of time between billings for monthly accounts is between 27-35 days and the period between billings for bi-monthly accounts is between 56-65 days. So, in this analysis we discarted all the bills outside these two ranges. Bills that are outside these ranges usually include special fees, like late fees or installation fees.

Each bill includes a 16% value added tax (known as IVA), and a 8% tax (DAP) that is used to pay the public street lighting system in Mexico.

If a user has a high consumption, will pay also a fixed monthly fee, that is different for each tariff.

### Descriptive Statistics  
#### df.enc.dwell$bill.days1.valid  
**N:** 28953  

|               &nbsp; |  Mean | Median | Std.Dev. |   Min |   Max |  N.Valid | % Valid |
|---------------------:|------:|-------:|---------:|------:|------:|---------:|--------:|
| **bill.days1.valid** | 56.16 |  61.00 |    11.02 | 27.00 | 65.00 | 10676.00 |   36.87 |

Calculating days of consumption of 2nd electricity bill (users that pay two electricity bills each period)

df.enc.dwell$bill.ini.date2 <- NA
df.enc.dwell$bill.end.date2 <- NA

# The data was obtained from January 2018 to June 2018.
# So the bills could be from 2017 or 2018. As the survey does not report 
# the year of the electricity bill, it is calculated here
df.enc.dwell$year_ini2 <- "2018"
df.enc.dwell$year_end2 <- "2018"

# Case 2. If the initial period of the bill is from june to december, 
# it is assumed that the year of the initial period is 2017
df.enc.dwell$year_ini2[df.enc.dwell$mes_inic2 >= 6] <- 2017

# Case 2. If the final period of the bill is from july to december, 
# it is assumed that the year of the final period is 2017
df.enc.dwell$year_end2[df.enc.dwell$mes_final2 >= 7] <- 2017

# Case 3. If the initia month period of the bill is higher than the final month, 
# it is assumed that the year of the initial period is 2017
df.enc.dwell$year_ini2[df.enc.dwell$mes_inic2 > df.enc.dwell$mes_final2] <- 2017

df.enc.dwell$bill.ini.date2 <-  str_replace_all(paste(df.enc.dwell$mes_inic2, "-", df.enc.dwell$inicia2, "-", 
                                          df.enc.dwell$year_ini2), pattern=" ", repl="")

df.enc.dwell$bill.end.date2 <-  str_replace_all(paste(df.enc.dwell$mes_final2, "-", df.enc.dwell$final2, "-", 
                                          df.enc.dwell$year_end2), pattern=" ", repl="")

df.enc.dwell$bill.ini.date2 <- as.Date(as.character(df.enc.dwell$bill.ini.date2), 
                           format="%m-%d-%Y")

df.enc.dwell$bill.end.date2 <- as.Date(as.character(df.enc.dwell$bill.end.date2), 
                           format="%m-%d-%Y")

df.enc.dwell$bill.days2 <- as.integer((df.enc.dwell$bill.end.date2 - df.enc.dwell$bill.ini.date2))

# After obtaining the number of days of the period in the electricity bill. 
# There are still some special cases that are corrected here.
# Case 4. If the period in the electricity bill is longer than 2 year, the 
# initial year is assumed to be 2018
df.enc.dwell$year_ini2[df.enc.dwell$bill.days2 >= 365 & 
                          (df.enc.dwell$mes_inic2 <= df.enc.dwell$mes_final2)] <- 2018

# Case 5. If the period in the electricity bill is negative, the initial 
# year is assumed to be 2017
df.enc.dwell$year_ini2[df.enc.dwell$bill.days2 < 0] <- 2017

df.enc.dwell$bill.end.date2 <-  str_replace_all(paste(df.enc.dwell$mes_final2, "-", df.enc.dwell$final2, "-", 
                                          df.enc.dwell$year_end2), pattern=" ", repl="")
df.enc.dwell$bill.ini.date2 <-  str_replace_all(paste(df.enc.dwell$mes_inic2, "-" , df.enc.dwell$inicia2, "-", 
                                          df.enc.dwell$year_ini2), pattern=" ", repl="")

df.enc.dwell$bill.end.date2 <-  str_replace_all(paste(df.enc.dwell$mes_final2, "-" , df.enc.dwell$final2, "-", 
                                          df.enc.dwell$year_end2), pattern=" ", repl="")

df.enc.dwell$bill.ini.date2 <- as.Date(as.character(df.enc.dwell$bill.ini.date2), format="%m-%d-%Y")

df.enc.dwell$bill.end.date2 <- as.Date(as.character(df.enc.dwell$bill.end.date2), format="%m-%d-%Y")

df.enc.dwell$bill.days2 <- as.integer((df.enc.dwell$bill.end.date2 - df.enc.dwell$bill.ini.date2))

stat.bill.days2 <- descr(df.enc.dwell$bill.days2, style = "rmarkdown", transpose = TRUE, 
            stats = c("mean", "med", "sd", "min", "max", "n.valid", "pct.valid"), 
                   headings = TRUE)

print(stat.bill.days2,
      footnote = "<b>Source:</b> INEGI, 2019<br/><i>ENCEVI 2018:</i>")

Descriptive Statistics

df.enc.dwell$bill.days2

N: 28953

  Mean Median Std.Dev. Min Max N.Valid % Valid
bill.days2 55.46 60.00 14.35 0.00 92.00 106.00 0.37
### Descriptive Statistics  
#### df.enc.dwell$bill.days2.valid  
**N:** 28953  

|               &nbsp; |  Mean | Median | Std.Dev. |   Min |   Max | N.Valid | % Valid |
|---------------------:|------:|-------:|---------:|------:|------:|--------:|--------:|
| **bill.days2.valid** | 55.52 |  60.00 |    11.35 | 28.00 | 65.00 |   97.00 |    0.34 |

For this analysis, the records of dwellings that are not connected to the grid are not considered. There are other factors that will be considered later the analysis of tariffs, such as users that have a small business in their properties

Weighted Frequencies

df.enc.dwell$local_com

Weights: factor_sem

  Freq % % Cum.
1 2046003.00 6.24 6.24
2 30762505.00 93.76 100.00
Total 32808508.00 100.00 100.00

Mauricio Hernandez

2019-06-05